OVEREXPRESSION OF MAMMALIAN AND VIRAL PROTEINS

Field of the Invention

This invention concerns genes and methods for expressing eukaryotic
and viral proteins at high levels in eukaryotic cells.



Background of the Invention

Expression of eukaryotic gene products in prokaryotes is sometimes
limited by the presence of codons that are infrequently used in
E. coli. Expression of such genes can be enhanced by systematic
substitution of the endogenous codons with codons overrepresented in
highly expressed prokaryotic genes (Robinson et al.). It is commonly
supposed that rare codons cause pausing of the ribosome, which leads
to a failure to complete the nascent polypeptide chain and an
uncoupling of transcription and translation. The mRNA 3' end of the
stalled ribosome is exposed to cellular ribonucleases, which
decreases the stability of the transcript.

Summary of the Invention

The invention features a synthetic gene encoding a protein normally
expressed in mammalian cells wherein at least one non-preferred or
less preferred codon in the natural gene encoding the mammalian
protein has been replaced by a preferred codon encoding the same
amino acid.

Preferred codons are: Ala (gcc); Arg (cgc); Asn (aac); Asp (gac);
Cys (tgc); Gln (cag); Gly (ggc); His (cac); Ile (atc); Leu (ctg);
Lys (aag); Pro (ccc); Phe (ttc); Ser (agc); Thr (acc); Tyr (tac);
and Val (gtg). Less preferred codons are: Gly (ggg); Ile (att);
Leu (ctc); Ser (tcc); Val (gtc). All codons which do not fit the
description of preferred codons or less preferred codons are
non-preferred codons.
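The three codon classes defined above can be illustrated with a short Python sketch. It is only an illustration and not part of the disclosure: the function names are hypothetical, the codon lists are transcribed from the text, and the rule is applied literally, so codons for amino acids that have no listed preferred codon (for example atg for Met) are reported as non-preferred.

```python
# Codon classes transcribed from the definitions in the text (DNA, lower case).
PREFERRED = {
    "Ala": "gcc", "Arg": "cgc", "Asn": "aac", "Asp": "gac", "Cys": "tgc",
    "Gln": "cag", "Gly": "ggc", "His": "cac", "Ile": "atc", "Leu": "ctg",
    "Lys": "aag", "Pro": "ccc", "Phe": "ttc", "Ser": "agc", "Thr": "acc",
    "Tyr": "tac", "Val": "gtg",
}
PREFERRED_CODONS = set(PREFERRED.values())
LESS_PREFERRED_CODONS = {"ggg", "att", "ctc", "tcc", "gtc"}


def classify_codon(codon):
    """Return the class of one codon under the literal rule in the text."""
    codon = codon.lower()
    if codon in PREFERRED_CODONS:
        return "preferred"
    if codon in LESS_PREFERRED_CODONS:
        return "less preferred"
    return "non-preferred"


def split_codons(seq):
    """Split an in-frame coding sequence into a list of codons."""
    seq = seq.lower()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def class_fractions(seq):
    """Fraction of codons in each class for an in-frame coding sequence."""
    codons = split_codons(seq)
    counts = {"preferred": 0, "less preferred": 0, "non-preferred": 0}
    for codon in codons:
        counts[classify_codon(codon)] += 1
    return {name: n / len(codons) for name, n in counts.items()}
```

The "non-preferred" entry returned by `class_fractions` is the quantity that the percentage thresholds for non-preferred codons in the natural gene refer to.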
By "protein normally expressed in mammalian cells" is meant a
protein which is expressed in mammalian cells under natural
conditions. The term includes genes in the mammalian genome such as
Factor VIII, Factor IX, interleukins, and other proteins. The term
also includes genes which are expressed in a mammalian cell under
disease conditions such as oncogenes as well as genes which are
encoded by a virus (including a retrovirus) which are expressed in
mammalian cells post-infection.

In preferred embodiments, the synthetic gene is capable of
expressing said mammalian protein at a level which is at least 110%,
150%, 200%, 500%, 1,000%, or 10,000% of that expressed by said
natural gene in an in vitro mammalian cell culture system under
identical conditions (i.e., same cell type, same culture conditions,
same expression vector).

Suitable cell culture systems for measuring expression of the
synthetic gene and corresponding natural gene are described below.
Other suitable expression systems employing mammalian cells are well
known to those skilled in the art and are described in, for example,
the standard molecular biology reference works noted below. Vectors
suitable for expressing the synthetic and natural genes are
described below and in the standard reference works described below.
By "expression" is meant protein expression. Expression can be
measured using an antibody specific for the protein of interest.
Such antibodies and measurement techniques are well known to those
skilled in the art. By "natural gene" is meant the gene sequence
which naturally encodes the protein.

In other preferred embodiments at least 10%, 20%, 
30%, 40%, 50%, 60%, 70%, 80%, or 90% of the codons in the 
natural gene are non-preferred codons. 
In a preferred embodiment [...] even more preferred embodiment [...]
rather than a preferred [...]

By "vector" is meant a DNA molecule derived [...]

DNA expression vectors include mammalian plasmids and viruses.

The invention also features synthetic gene fragments which encode a
desired portion of the protein. Such synthetic gene fragments are
similar to the synthetic genes of the invention except that they
encode only a portion of the protein. Such gene fragments preferably
encode at least 50, 100, 150, or 500 contiguous amino acids of the
protein.

In constructing the synthetic genes of the invention it may be
desirable to avoid CpG sequences as these sequences may cause gene
silencing.
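As a minimal sketch of how CpG sequences in a candidate design could be located, the following Python helper lists every position at which a C is immediately followed by a G. The function names are hypothetical and the check is purely textual; it says nothing about whether a given CpG would actually lead to silencing.

```python
def cpg_positions(seq):
    """Return 0-based positions of every 'cg' dinucleotide in a DNA sequence."""
    seq = seq.lower()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "cg"]


def count_cpg(seq):
    """Number of CpG dinucleotides in the sequence."""
    return len(cpg_positions(seq))
```

Because a CpG can also arise across a codon boundary (a codon ending in c followed by a codon starting with g), the scan is run over the whole assembled sequence rather than codon by codon.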

The codon bias present in the HIV gp120 envelope gene is also
present in the gag and pol proteins. Thus, replacement of a portion
of the non-preferred and less preferred codons found in these genes
with preferred codons should produce a gene capable of higher level
expression. A large fraction of the codons in the human genes
encoding Factor VIII and Factor IX are non-preferred codons or less
preferred codons. Replacement of a portion of these codons with
preferred codons should yield genes capable of higher level
expression in mammalian cell culture. Conversely, it may be
desirable to replace preferred codons in a naturally occurring gene
with less-preferred codons as a means of lowering expression.
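The replacement strategy described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used to build the genes described here: it replaces every codon with the preferred codon for the same amino acid, leaves codons unchanged when no preferred codon is listed (Met, Trp, Glu and stop codons), and ignores the CpG and restriction-site considerations mentioned in the text. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Preferred codons from the text, keyed by one-letter amino acid code.
PREFERRED = {
    "A": "gcc", "R": "cgc", "N": "aac", "D": "gac", "C": "tgc", "Q": "cag",
    "G": "ggc", "H": "cac", "I": "atc", "L": "ctg", "K": "aag", "P": "ccc",
    "F": "ttc", "S": "agc", "T": "acc", "Y": "tac", "V": "gtg",
}


def split_codons(seq):
    seq = seq.lower()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(seq))


def recode_with_preferred(seq):
    """Swap each codon for the preferred codon encoding the same amino acid."""
    recoded = []
    for codon in split_codons(seq):
        amino_acid = CODON_TO_AA[codon]
        # Keep the original codon when the text lists no preferred codon.
        recoded.append(PREFERRED.get(amino_acid, codon))
    return "".join(recoded)
```

For example, `recode_with_preferred("atgagagtaaaaggg")` changes four of the five codons while `translate` gives the same peptide before and after. Note that the preferred arginine codon cgc itself contains a CpG, which is one reason a real design may deviate from strict replacement.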

Standard reference works describing the general principles of
recombinant DNA technology include Watson, J.D. et al., Molecular
Biology of the Gene, Volumes I and II, the Benjamin/Cummings
Publishing Company, Inc., publisher, Menlo Park, CA (1987); Darnell,
J.E. et al., Molecular Cell Biology, Scientific American Books,
Inc., Publisher, New York, N.Y. (1986); Old, R.W., et al.,
Principles of Gene Manipulation: An Introduction to Genetic
Engineering, 2d edition, University of California Press, Berkeley,
CA; Maniatis, T. et al. [...] (1989); and [...] Wiley Press, New
York, NY.

Detailed Description

Description of the Drawings

Figure 2 is a schematic drawing of the synthetic gp120 (HIV-1 MN)
gene. The shaded portions marked V1 to V5 indicate hypervariable
regions. The filled box [...] unique restriction sites are shown:
H (Hind3), Nh (Nhe1), P (Pst1), Na (Nae1), M (Mlu1), [...] Age1, and
No (Not1). The chemically synthesized DNA fragments which served as
PCR templates are shown below the gp120 sequence, along with the
locations of the primers used for their amplification.

Figure 3 is a photograph of the results of transient transfection
assays used to measure gp120 expression. Gel electrophoresis of
immunoprecipitated supernatants of 293T cells transfected with
plasmids expressing gp120 encoded by the IIIB isolate of HIV-1
(gp120IIIb), by the MN isolate (gp120mn), by the MN isolate modified
by substitution of the endogenous leader peptide with that of the
CD5 antigen (gp120mnCD5L), or by the chemically synthesized gene
encoding the MN variant with the human CD5 leader (syngp120mn).
Supernatants were harvested following a 12 hour labeling period 60
hours post-transfection and immunoprecipitated with CD4:IgG1 fusion
protein and protein A sepharose.
Figure 4 is a graph depicting the results of ELISA assays used to
measure protein levels in supernatants of transiently transfected
293T cells. Supernatants of 293T cells transfected with plasmids
expressing gp120 encoded by the IIIB isolate of HIV-1 (gp120IIIb),
by the MN isolate (gp120mn), by the MN isolate modified by
substitution of the endogenous leader peptide with that of CD5
antigen (gp120mnCD5L), or by the chemically synthesized gene
encoding the MN variant with human CD5 leader (syngp120mn) were
harvested after 4 days and tested in a gp120/CD4 ELISA. The level of
gp120 is expressed in ng/ml.

Figure 5, panel A is a photograph of a gel illustrating the results
of an immunoprecipitation assay used to measure expression of the
native and synthetic gp120 in the presence of rev in trans and the
RRE in cis. In this experiment 293T cells were transiently
transfected by calcium phosphate coprecipitation of 10 ng of plasmid
expressing: (A) the synthetic gp120MN sequence and RRE in cis,
(B) the gp120 portion of HIV-1 IIIB, (C) the gp120 portion of HIV-1
IIIB and RRE in cis, all in the presence or absence of rev
expression. The RRE constructs gp120IIIbRRE and syngp120mnRRE were
generated using an Eag1/Hpa1 RRE fragment cloned by PCR from an
HIV-1 HXB2 proviral clone. Each gp120 expression plasmid was
cotransfected with 10 µg of either pCMVrev or CDM7 plasmid DNA.
Supernatants were harvested 60 hours post transfection,
immunoprecipitated with CD4:IgG fusion protein and protein A
agarose, and run on a 7% reducing SDS-PAGE. The gel exposure time
was extended to allow the induction of gp120IIIbRRE by rev to be
demonstrated. Figure 5, panel B is a shorter exposure of a similar
experiment in which syngp120mnRRE was cotransfected with or without
pCMVrev. Figure 5, panel C is a schematic diagram of the constructs
used in panel A.
[...] rat THY-1 [...] constructed by chemical synthesis and [...]
prevalent codons found in the [...] construction are shown at the
bottom of the figure.

[...] cytofluorometry analysis. In this experiment 293T cells [...]
or vector only [...] rat THY-1 monoclonal [...]

Figure 9, panel A is a photograph of a gel illustrating [...] region
of the syngp120mn gene (B). The [...] Not1 site of the syngp120mn
plasmid and tested for correct orientation. Supernatants of 35S
labelled cells were harvested 72 hours post transfection,
precipitated with CD4:IgG fusion protein and protein A agarose, and
run on a 7% reducing SDS-PAGE. Figure 9, panel B is a schematic
diagram of the constructs used in the experiment depicted in panel A
of this figure.

Description of the Preferred Embodiments

Construction of a Synthetic gp120 Gene Having Codons Found in Highly
Expressed Human Genes

A codon frequency table for the envelope precursor of the LAV
subtype of HIV-1 was generated using software developed by the
University of Wisconsin Genetics Computer Group. The results of that
tabulation are contrasted in Table 1 with the pattern of codon usage
by a collection of highly expressed human genes. For any amino acid
encoded by degenerate codons, the most favored codon of the highly
expressed genes is different from the most favored codon of the HIV
envelope precursor. Moreover a simple rule describes the pattern of
favored envelope codons wherever it applies: preferred codons
maximize the number of adenine residues in the viral RNA. In all
cases but one this means that the codon in which the third position
is A is the most frequently used. In the special case of serine,
three codons equally contribute one A residue to the mRNA; together
these three comprise 85% of the codons actually used in envelope
transcripts. A particularly striking example of the A bias is found
in the codon choice for arginine, in which the AGA triplet comprises
88% of all codons. In addition to the preponderance of A residues, a
marked preference is seen for uridine among degenerate codons whose
third residue must be a pyrimidine. Finally, the inconsistencies
among the less
frequently used variants can be accounted for by the observation
that the dinucleotide CpG is underrepresented; thus the third
position is less likely to be G whenever the second position is C,
as in the codons for alanine, proline, serine and threonine; and the
CGX triplets for arginine are hardly used at all.

TABLE 1: Codon Frequency in the HIV-1 IIIb env gene and in highly
expressed human genes

           Codon   High   Env

Ala        GCC      53     27
           GCT      17     18
           GCA      13     50
           GCG      17      5

Arg        CGC      37      0
           CGT       7      4
           CGA       6      0
           CGG      21      0
           AGA      10     88
           AGG      18      8

Asn        AAC      78     30
           AAT      22     70

Asp        GAC      75     33
           GAT      25     67

Cys        TGC      68     16
           TGT      32     84

Gln        CAA      12     55
           CAG      88     45

Glu        GAA      25     67
           GAG      75     33

Gly        GGC      50      6
           GGT      12     13
           GGA      14     53
           GGG      24     28

His        CAC      79     25
           CAT      21     75

Ile        ATC      77     25
           ATT      18     31
           ATA       5     44

Leu        CTC      26     10
           CTT       5      7
           CTA       3     17
           CTG      58     17
           TTA       2     30
           TTG       6     20

Lys        AAA      18     68
           AAG      82     32

Pro        CCC      48     27
           CCT      19     14
           CCA      16     55
           CCG      17      5

Phe        TTC      80     26
           TTT      20     74

Ser        TCC      28      8
           TCT      13      8
           TCA       5     22
           TCG       9      0
           AGC      34     22
           AGT      10     41

Thr        ACC      57     20
           ACT      14     22
           ACA      14     51
           ACG      15      7

Tyr        TAC      74      8
           TAT      26     92

Val        GTC      25     12
           GTT       7      9
           GTA       5     62
           GTG      64     18


Codon frequency was calculated using the GCG program established by
the University of Wisconsin Genetics Computer Group. Numbers
represent the percentage of cases in which the particular codon is
used. Codon usage frequencies of envelope genes of other HIV-1 virus
isolates are comparable and show a similar bias.
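The percentages in Table 1 are per-amino-acid codon frequencies. The tabulation itself was done with the GCG program; the Python sketch below is not that program, only a minimal illustration of the same kind of calculation for a single in-frame coding sequence. The function name is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, bases ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_usage(seq):
    """Percentage use of each codon among the codons for the same amino acid."""
    seq = seq.lower().replace("u", "t")  # accept RNA as well as DNA
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq), 3))

    # Group the observed codon counts by the amino acid they encode.
    per_amino_acid = defaultdict(dict)
    for codon, n in counts.items():
        per_amino_acid[CODON_TO_AA[codon]][codon] = n

    usage = {}
    for amino_acid, codon_counts in per_amino_acid.items():
        total = sum(codon_counts.values())
        usage[amino_acid] = {codon: 100.0 * n / total
                             for codon, n in codon_counts.items()}
    return usage
```

Codons that never occur in the input are simply absent from the result, so a full table would fill those entries with zero.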



In order to produce a gp120 gene capable of high level expression in
mammalian cells, a synthetic gene encoding the gp120 segment of
HIV-1 was constructed (syngp120mn), based on the sequence of the
most common North American subtype, HIV-1 MN (Shaw et al. 1984;
Gallo et al. 1986). In this synthetic gp120 gene nearly all of the
native codons have been systematically replaced with codons most
frequently used in highly expressed human genes (FIG. 1). This
synthetic gene was assembled from chemically synthesized
oligonucleotides of 150 to 200 bases in length. If oligonucleotides
exceeding 120 to 150 bases are chemically synthesized, the
percentage of full-length product can be low, and the vast excess of
material consists of shorter oligonucleotides. Since these shorter
fragments inhibit cloning and PCR procedures, it can be very
difficult to use oligonucleotides exceeding a certain length. In
order to use crude synthesis material without prior purification,
single-stranded oligonucleotide pools were PCR amplified before
cloning. PCR products were purified in agarose gels and used as
templates in the next PCR step. Two
adjacent fragments could be co-amplified [...] overlapping sequences
at the end of either fragment. These fragments, which were between
[...] the whole gp120 gene was [...] is presented in FIG. 1.

[...] and long (up to 30 nucleotides) [...] it was necessary to
exchange parts with either synthetic adapters or pieces from other
subclones without mutation in that particular region. [...] to
accommodate the introduction of restriction sites into the resulting
gene to facilitate the replacement of various segments (FIG. 2).
These unique restriction [...] with the highly efficient leader
peptide of the [...] antigen to facilitate secretion. The plasmid
used for construction is a derivative of the mammalian expression
vector pCDM7 transcribing the inserted gene under the control of a
strong human CMV immediate early promoter.
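The earlier remark that long oligonucleotides give a low percentage of full-length product can be illustrated with a common back-of-the-envelope model. The model is an assumption introduced here, not something stated in the text: if every coupling step succeeds independently with the same efficiency, an n-mer requires n - 1 successful couplings, so the full-length fraction is the efficiency raised to the power n - 1. The 99% efficiency used in the usage note is a hypothetical value.

```python
def full_length_fraction(length, coupling_efficiency):
    """Expected fraction of full-length product for an oligonucleotide.

    Assumes each of the (length - 1) coupling steps succeeds independently
    with the same probability, which is a simplification.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)
```

Calling `full_length_fraction(120, 0.99)` and `full_length_fraction(200, 0.99)` shows how quickly the full-length fraction falls between a 120-mer and a 200-mer under this assumed efficiency, which is consistent with the use of PCR amplification of crude pools described above.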
To compare the wild-type and synthetic gp120 coding sequences, the
synthetic gp120 coding sequence was inserted into a mammalian
expression vector and tested in transient transfection assays.
Several different native gp120 genes were used as controls to
exclude variations in expression levels between different virus
isolates and artifacts induced by distinct leader sequences. The
gp120 HIV IIIb construct used as control was generated by PCR using
a Sal1/Xho1 HIV-1 HXB2 envelope fragment as template. To exclude PCR
induced mutations a Kpn1/Ear1 fragment containing approximately 1.2
kb of the gene was exchanged with the respective sequence from the
proviral clone. The wildtype gp120mn constructs used as controls
were cloned by PCR from HIV-1 MN infected C8166 cells (AIDS
Repository, Rockville, MD) and expressed gp120 either with a native
envelope or a CD5 leader sequence. Since proviral clones were not
available in this case, two clones of each construct were tested to
avoid PCR artifacts. To determine the amount of secreted gp120
semi-quantitatively, supernatants of 293T cells transiently
transfected by calcium phosphate coprecipitation were
immunoprecipitated with soluble CD4:immunoglobulin fusion protein
and protein A sepharose.

The results of this analysis (FIG. 3) show that the synthetic gene
product is expressed at a very high level compared to that of the
native gp120 controls. The molecular weight of the synthetic gp120
gene product was comparable to control proteins (FIG. 3) and
appeared to be in the range of 100 to 110 kd. The slightly faster
migration can be explained by the fact that in some tumor cell lines
like 293T glycosylation is either not complete or altered to some
extent.

To compare expression more accurately gp120 protein levels were
quantitated using a gp120 ELISA with CD4 in the immobilized phase.
This analysis shows (FIG. 4) that ELISA data were comparable to the
immunoprecipitation data, with [...] shown the increase was at least
25 fold.

The Role of rev in gp120 Expression

Since rev appears to exert its effect at several steps in the
expression of a viral transcript, the possible role of
non-translational effects in the [...] expression of the synthetic
gp120 gene was tested. [...] Cytoplasmic RNA was prepared by NP40
lysis of transiently transfected 293T cells and subsequent
elimination of the nuclei by centrifugation. Cytoplasmic RNA was
subsequently prepared from lysates by multiple phenol extractions
and precipitation, spotted on nitrocellulose using a slot blot
apparatus, and finally hybridized with an envelope-specific probe.

Briefly, cytoplasmic mRNA of [...] cells transfected with CDM7,
gp120 IIIB, or syngp120 [...] post transfection. Cytoplasmic RNA of
HeLa cells infected with wildtype vaccinia virus or recombinant
virus [...] under the control of the 7.5 promoter was isolated 16
hours post infection. Equal amounts were spotted on nitrocellulose
using a slot blot device and hybridized with randomly labelled 1.5
kb gp120IIIb and syngp120 fragments or human beta-actin. RNA
expression levels were quantitated by scanning the hybridized
membranes with a phosphorimager. The procedures used are described
in greater detail below.

This experiment demonstrated that there was no significant
difference in the mRNA levels of cells transfected with either the
native or synthetic gp120 gene. In fact, in some experiments the
cytoplasmic mRNA level of the synthetic gp120 gene was even lower
than that of the native gp120 gene.

These data were confirmed by measuring expression from recombinant
vaccinia viruses. Human 293 cells or HeLa cells were infected with
vaccinia virus expressing wildtype gp120 IIIb or syngp120mn at a
multiplicity of infection of at least 10. Supernatants were
harvested 24 hours post infection and immunoprecipitated with
CD4:immunoglobulin fusion protein and protein A sepharose. The
procedures used in this experiment are described in greater detail
below.

This experiment showed that the increased expression of the
synthetic gene was still observed when the endogenous gene product
and the synthetic gene product were expressed from vaccinia virus
recombinants under the control of the strong mixed early and late
7.5k promoter. Because vaccinia virus mRNAs are transcribed and
translated in the cytoplasm, increased expression of the synthetic
envelope gene in this experiment cannot be attributed to improved
export from the nucleus. This experiment was repeated in two
additional human cell types, the kidney cancer cell line 293 and
HeLa cells. As with transfected 293T cells, mRNA levels were similar
in 293 cells infected with either recombinant vaccinia virus.



WO 96/09378 



PCT/US95/11511 



Codon Usage in Lentivirus

Because it appears that codon usage has a [...] A codon frequency
[...] of four [...] virus, and visna virus, [...] pattern for
lentiviruses is strikingly similar to that of HIV-1 in [...] cases
but one [...] also reflects [...] triplet [...] codon usage by the
[...] third position [...] different codons more equally, a pattern
they share with less highly expressed human genes.






TABLE 2: Codon frequency in the envelope gene of 
lentiviruses (lenti) and non-lentiviral 
retroviruses (other) . 
[...] to the third position for alanine [...] the arginine triplet,
is rarely [...] CpG. The [...] underrepresentation [...]
lentiviruses and other retroviruses with respect to [...] lies in
the usage of the CGX [...]
expression of both native and synthetic gene was investigated. Since
regulation by rev requires the rev-binding site RRE in cis,
constructs were made in which this binding site was cloned into the
3' untranslated region of both the native and the synthetic gene.
These plasmids were co-transfected with rev or a control plasmid in
trans into 293T cells, and gp120 expression levels in supernatants
were measured semiquantitatively by immunoprecipitation. The
procedures used in this experiment are described in greater detail
below.

As shown in FIG. 5, panels A and B, rev upregulates the native gp120
gene, but has no effect on the expression of the synthetic gp120
gene. Thus, the action of rev is not apparent on a substrate which
lacks the coding sequence of endogenous viral envelope sequences.

Expression of a synthetic rat THY-1 gene with HIV envelope codons

The above-described experiment suggests that in fact "envelope
sequences" have to be present for rev regulation. In order to test
this hypothesis, a synthetic version of the gene encoding the small,
typically highly expressed cell surface protein, rat THY-1 antigen,
was prepared. The synthetic version of the rat THY-1 gene was
designed to have a codon usage like that of HIV gp120. In designing
this synthetic gene AUUUA sequences, which are associated with mRNA
instability, were avoided. In addition, two restriction sites were
introduced to simplify manipulation of the resulting gene (FIG. 6).
This synthetic gene with the HIV envelope codon usage (rTHY-1env)
was generated using three 150 to 170 mer oligonucleotides (FIG. 7).
In contrast to the syngp120mn gene, PCR products were directly
cloned and assembled in pUC12, and subsequently cloned into pCDM7.
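A design constraint such as avoiding AUUUA sequences can be checked mechanically. The following Python sketch, with a hypothetical function name, reports every occurrence of the motif in a candidate sequence, including overlapping ones; it accepts either RNA (u) or DNA (t) spelling and is only an illustration of the check, not the procedure used to design rTHY-1env.

```python
def find_auuua(seq):
    """Return 0-based start positions of AUUUA (ATTTA in DNA) in a sequence."""
    dna = seq.lower().replace("u", "t")
    motif = "attta"
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping motifs are also found.
        start = dna.find(motif, start + 1)
    return positions
```

An empty list means the candidate contains no AUUUA motif; otherwise the positions indicate which codons would need an alternative synonymous choice.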
responsiveness of a rat THY-1env construct having a 3' RRE, human
293T cells were cotransfected with ratTHY-1envrre and either CDM7 or
pCMVrev. At 60 hours post transfection cells were detached with 1 mM
EDTA in PBS and stained with the OX-7 anti rTHY-1 mouse monoclonal
antibody and a secondary FITC-conjugated antibody. Fluorescence
intensity was measured using an EPICS XL cytofluorometer. These
procedures are described in greater detail below.

In repeated experiments, a slight increase of rTHY-1env expression
was detected if rev was cotransfected with the rTHY-1env gene. To
further increase the sensitivity of the assay system a construct
expressing a secreted version of rTHY-1env was generated. This
construct should produce more reliable data because the accumulated
amount of secreted protein in the supernatant reflects the result of
protein production over an extended period, in contrast to surface
expressed protein, which appears to more closely reflect the current
production rate. A gene capable of expressing a secreted form was
prepared by PCR using forward and reverse primers annealing 3' of
the endogenous leader sequence and 5' of the sequence motif required
for phosphatidylinositol glycan anchorage respectively. The PCR
product was cloned into a plasmid which already contained a CD5
leader sequence, thus generating a construct in which the membrane
anchor has been deleted and the leader sequence exchanged for a
heterologous (and probably more efficient) leader peptide.

The rev-responsiveness of the secreted form of ratTHY-1env was
measured by immunoprecipitation of supernatants of human 293T cells
cotransfected with a plasmid expressing a secreted form of
ratTHY-1env and the RRE sequence in cis (rTHY-1envPI-rre) and either
CDM7 or pCMVrev. The rTHY-1envPI-rre construct was made by PCR
using the oligonucleotides cgcggggctagcgcaaagagtaataagtttaac as
forward and cgcggatcccttgtattttgtactaataa as reverse primers and the
synthetic rTHY-1env construct as template. After digestion with Nhe1
and Not1 the PCR fragment was cloned into a plasmid containing CD5
leader and RRE sequences. Supernatants of 35S labelled cells were
harvested 72 hours post transfection, precipitated with a mouse
monoclonal antibody OX7 against rTHY-1 and anti mouse IgG sepharose,
and run on a 12% reducing SDS-PAGE.

In this experiment the induction of rTHY-1env by rev was much more
prominent and clearcut than in the above-described experiment and
strongly suggests that rev is able to translationally regulate
transcripts that are suppressed by low-usage codons.

To test whether low-usage codons must be present throughout the
whole coding sequence or whether a short region is sufficient to
confer rev-responsiveness, a rTHY-1env:immunoglobulin fusion protein
was generated. In this construct the rTHY-1env gene (without the
sequence motif responsible for phosphatidylinositol glycan
anchorage) is linked to the human IgG1 hinge, CH2 and CH3 domains.
This construct was generated by anchor PCR using primers with Nhe1
and BamH1 restriction sites and rTHY-1env as template. The PCR
fragment was cloned into a plasmid containing the leader sequence of
the CD5 surface molecule and the hinge, CH2 and CH3 parts of human
IgG1 immunoglobulin. A Hind3/Eag1 fragment containing the
rTHY-1enveg1 insert was subsequently cloned into a pCDM7-derived
plasmid with the RRE sequence.

To measure the response of the rTHY-1env/immunoglobulin fusion gene
(rTHY-1enveg1rre) to rev, human 293T cells were cotransfected with
rTHY-1enveg1rre and either pCDM7 or pCMVrev. The rTHY-1enveg1rre
construct was made by anchor PCR using forward and reverse primers
with Nhe1 and BamH1 restriction sites respectively. The PCR fragment
was cloned into a plasmid containing a CD5 leader and human IgG1
hinge, CH2 and CH3 domains. Supernatants of 35S labelled cells were
harvested 72 hours post transfection, precipitated with a mouse
monoclonal antibody OX7 against rTHY-1 and anti mouse IgG sepharose,
and run on a 12% reducing SDS-PAGE. The procedures used are
described in greater detail below.

As with the product of the rTHY-1envPI- gene, this
rTHY-1env/immunoglobulin fusion protein is secreted into the
supernatant. Thus, this gene should be responsive to rev-induction.
However, in contrast to rTHY-1envPI-, cotransfection of rev in trans
induced no or only a negligible increase of rTHY-1enveg1 expression.

The expression of rTHY-1:immunoglobulin fusion protein with native
rTHY-1 or HIV envelope codons was measured by immunoprecipitation.
Briefly, human 293T cells were transfected with either rTHY-1enveg1
(env codons) or rTHY-1wteg1 (native codons). The rTHY-1wteg1
construct was generated in a manner similar to that used for the
rTHY-1enveg1 construct, with the exception that a plasmid containing
the native rTHY-1 gene was used as template. Supernatants of 35S
labelled cells were harvested 72 hours post transfection,
precipitated with a mouse monoclonal antibody OX7 against rTHY-1 and
anti mouse IgG sepharose, and run on a 12% reducing SDS-PAGE. The
procedures used in this experiment are described in greater detail
below.

Expression levels of rTHY-1enveg1 were decreased in comparison to a
similar construct with wildtype rTHY-1 as the fusion partner, but
were still considerably higher than rTHY-1env. Accordingly, both
parts of the fusion protein influenced expression levels. The
addition of
expression is due to translational differences and not mRNA
stability.

Retroviruses in general do not show a similar preference towards A
and T as found for HIV. But if this family is divided into two
subgroups, lentiviruses and non-lentiviral retroviruses, a similar
preference for A and, less frequently, T is detected at the third
codon position for lentiviruses. Thus, the available evidence
suggests that lentiviruses retain a characteristic pattern of
envelope codons not because of an inherent advantage to the reverse
transcription or replication of such residues, but rather for some
reason peculiar to the physiology of that class of viruses. The
major difference between lentiviruses and non-complex retroviruses
is the additional regulatory and non-essential accessory genes in
lentiviruses, as already mentioned. Thus, one simple explanation for
the restriction of envelope expression might be that an important
regulatory mechanism of one of these additional molecules is based
on it. In fact, one of these proteins is rev, which most likely has
homologues in all lentiviruses. Thus codon usage in viral mRNA is
used to create a class of transcripts which is susceptible to the
stimulatory action of rev. This hypothesis was proved using a
strategy similar to that above, but this time codon usage was
changed in the inverse direction. Codon usage of a highly expressed
cellular gene was substituted with the most frequently used codons
in the HIV envelope. As assumed, expression levels were considerably
lower in comparison to the native molecule, almost two orders of
magnitude when analyzed by immunofluorescence of the surface
expressed molecule (see 4.7). If rev was coexpressed in trans and an
RRE element was present in cis only a slight induction was found for
the surface molecule. However, if THY-1 was expressed as
composition. This might indicate that the possibility of high
expression is restored, and that the gene in fact has to be highly
expressed at some point during viral pathogenesis.

5 The results presented herein clearly indicate that 

codon preference has a severe effect on protein levels, 
and suggest that translational elongation is controlling 
mammalian gene expression. However, other factors may 
play ar role. First, abundance of not maximally loaded 

10 mRNA's in eukaryotic cells indicates that initiation is 
rate limiting for translation in at least some cases, 
since otherwise all transcripts would be completely 
covered by ribosomes. Furthermore, if ribosome stalling 
and subsequent mRNA degradation were the mechanism, 

15 suppression by rare codons could most likely not be 
reversed by any regulatory mechanism like the one 
presented herein. One possible explanation for the 
influence of both initiation and elongation on 
translational activity is that the rate of initiation, or 

20 access to ribosomes, is controlled in part by cues 

distributed throughout the RNA, such that the lentiviral 
codons predispose the RNA to accumulate in a pool of 
poorly initiated RNAs. However, this limitation need not 
be kinetic; for example, the choice of codons could 

25 influence the probability that a given translation 

product, once initiated, is properly completed. Under 
this mechanism, abundance of less favored codons would 
incur a significant cumulative probability of failure to 
complete the nascent polypeptide chain. The sequestered 

30 RNA would then be lent an improved rate of initiation by 
the action of rev. Since adenine residues are abundant 
in rev-responsive transcripts, it could be that RNA 
adenine methylation mediates this translational 
suppression. 
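
The cumulative-failure argument above can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that every less favored codon carries the same independent probability of aborting elongation; the function name and the per-codon value of 0.01 are hypothetical and are not taken from the experiments described herein.

```python
def completion_probability(n_rare_codons: int, p_fail_per_codon: float) -> float:
    """Chance that a nascent chain is completed.

    Assumes (hypothetically) that each less favored codon aborts
    translation independently with probability p_fail_per_codon.
    """
    if n_rare_codons < 0:
        raise ValueError("n_rare_codons must be non-negative")
    if not 0.0 <= p_fail_per_codon <= 1.0:
        raise ValueError("p_fail_per_codon must lie in [0, 1]")
    # Each rare codon must be passed successfully, so survival multiplies.
    return (1.0 - p_fail_per_codon) ** n_rare_codons


if __name__ == "__main__":
    # Illustrative values only: 1% failure per less favored codon.
    for n in (10, 100, 300):
        p = completion_probability(n, 0.01)
        print(f"{n:4d} rare codons -> completion probability {p:.3f}")
```

Under these assumptions even a small per-codon failure rate compounds quickly over a gene that contains several hundred less favored codons, which is the sense in which the probability is cumulative.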
mM Tris HCl, pH 7.5, 60 mM MgCl2, 50 mM NaCl, 4 mg/ml
BSA, 70 mM β-mercaptoethanol, 0.02% NaN3); 10x Ligation
additions (1 mM ATP, 20 mM DTT, 1 mg/ml BSA, 10 mM
spermidine); 50x TAE (2 M Tris acetate, 50 mM EDTA).
Oligonucleotide synthesis and purification

Oligonucleotides were produced on a Milligen 8750
synthesizer (Millipore). The columns were eluted with 1
ml of 30% ammonium hydroxide, and the eluted
oligonucleotides were deblocked at 55°C for 6 to 12
hours. After deblocking, 150 µl of oligonucleotide were
precipitated with 10x volume of unsaturated n-butanol in
1.5 ml reaction tubes, followed by centrifugation at
15,000 rpm in a microfuge. The pellet was washed with
70% ethanol and resuspended in 50 µl of H2O. The
concentration was determined by measuring the optical
density at 260 nm in a dilution of 1:333 (1 OD260 = 30
µg/ml).
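
The concentration estimate described above reduces to a single multiplication. The following sketch assumes that "1:333" means one part sample in 333 parts total volume and uses the stated conversion of 30 µg/ml per OD260 unit; the function and constant names are illustrative only.

```python
UG_PER_ML_PER_OD260 = 30.0  # conversion factor stated in the text
DILUTION_FACTOR = 333.0     # assumption: 1 part sample in 333 parts total


def oligo_concentration_ug_per_ml(od260: float,
                                  dilution_factor: float = DILUTION_FACTOR,
                                  ug_per_ml_per_od: float = UG_PER_ML_PER_OD260) -> float:
    """Concentration of the undiluted oligonucleotide stock in µg/ml."""
    if od260 < 0:
        raise ValueError("optical density cannot be negative")
    # Scale the reading back up by the dilution, then convert OD to mass.
    return od260 * dilution_factor * ug_per_ml_per_od


if __name__ == "__main__":
    reading = 0.1  # hypothetical spectrophotometer reading
    print(f"{oligo_concentration_ug_per_ml(reading):.0f} µg/ml")
```

Because 333 x 30 is 9990, the OD260 reading multiplied by ten gives the stock concentration in mg/ml to a good approximation, which may be why this particular dilution is convenient.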

The following oligonucleotides were used for
construction of the synthetic gp120 gene (all sequences
shown in this text are in 5' to 3' direction).

oligo 1 forward (NheI): cgc ggg cta gcc acc gag
aag ctg (SEQ ID NO: 1).

oligo 1: acc gag aag ctg tgg gtg acc gtg tac tac
ggc gtg ccc gtg tgg aag ag ag gcc acc acc acc ctg ttc tgc
gcc agc gac gcc aag gcg tac gac acc gag gtg cac aac gtg
tgg gcc acc cag gcg tgc gtg ccc acc gac ccc aac ccc cag
gag gtg gag ctc gtg aac gtg acc gag aac ttc aac atg (SEQ
ID NO: 2).

oligo 1 reverse: cca cca tgt tgt tct tcc aca tgt
tga agt tct c (SEQ ID NO: 3).

oligo 2 forward: gac cga gaa ctt caa cat gtg gaa
gaa caa cat (SEQ ID NO: 4).

oligo 2: tgg aag aac aac atg gtg gag cag atg cat
gag gac atc atc agc ctg tgg gac cag agc ctg aag ccc tgc
gtg aag ctg acc cc ctg tgc gtg acc tg aac tgc acc gac ctg
oligo 6: gcc aag tgg aac gac acc ctg cgc cag atc
gtg agc aag ctg aag gag cag ttc aag aac aag acc atc gtg
ttc ac cag agc agc ggc ggc gac ccc gag atc gtg atg cac
agc ttc aac tgc ggc ggc (SEQ ID NO: 17).

oligo 6 reverse (EcoRI): gca gta gaa gaa ttc gcc
gcc gca gtt ga (SEQ ID NO: 18).

oligo 7 forward (EcoRI): tca act gcg gcg gcg aat
tct tct act gc (SEQ ID NO: 19).

oligo 7: ggc gaa ttc ttc tac tgc aac acc agc ccc
ctg ttc aac agc acc tgg aac ggc aac aac acc tgg aac aac
acc acc ggc agc aac aac aat att acc ctc cag tgc aag atc
aag cag atc atc aac atg tgg cag gag gtg ggc aag gcc atg
tac gcc ccc ccc atc gag ggc cag atc cgg tgc agc agc (SEQ
ID NO: 20).

oligo 7 reverse: gca gac cgg tga tgt tgc tgc tgc
acc gga tct ggc cct c (SEQ ID NO: 21).

oligo 8 forward: cga ggg cca gat ccg gtg cag cag
caa cat cac cgg tct g (SEQ ID NO: 22).

oligo 8: aac atc acc ggt ctg ctg ctg acc cgc gac
ggc ggc aag gac acc gac acc aac gac acc gaa atc ttc cgc
ccc ggc ggc ggc gac atg cgc gac aac tgg aga tct gag ctg
tac aag tac aag gtg gtg acg atc gag ccc ctg ggc gtg gcc
ccc acc aag gcc aag cgc cgc gtg gtg cag cgc gag aag cgc
(SEQ ID NO: 23).

oligo 8 reverse (NotI): cgc ggg cgg ccg ctt tag
cgc ttc tcg cgc tgc acc ac (SEQ ID NO: 24).
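
The relationship between the short primers and the long oligonucleotides listed above can be checked by taking a reverse complement. The sketch below uses the oligo 1 reverse primer together with the 3' end of oligo 1 and the 5' end of oligo 2 as given in this text (with the scanning substitution of "e" for "c" corrected); the helper name is illustrative, and the check is only a consistency test on the printed sequences, not a description of how the primers were designed.

```python
# Translation table for complementing lower- and upper-case DNA.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, ignoring whitespace."""
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


# Sequences copied from the oligonucleotide list in the text.
oligo1_reverse = "cca cca tgt tgt tct tcc aca tgt tga agt tct c"
oligo1_3prime_end = "gag aac ttc aac atg"      # last 15 nt of oligo 1
oligo2_5prime_end = "tgg aag aac aac atg gtg g"  # first 19 nt of oligo 2

junction = "".join((oligo1_3prime_end + oligo2_5prime_end).split())

if __name__ == "__main__":
    # The reverse primer should read as the junction on the opposite strand.
    print(reverse_complement(oligo1_reverse) == junction)
```

The comparison shows that the reverse primer spans the junction between oligo 1 and oligo 2, which is consistent with neighbouring fragments sharing an overlap.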

The following oligonucleotides were used for the
construction of the ratTHY-1env gene.

oligo 1 forward (BamHI/HindIII): cgc ggg gga tcc
aag ctt acc atg att cca gta ata agt (SEQ ID NO: 25).

oligo 1: atg aat cca gta ata agt ata aca tta tta
tta agt gta tta caa atg agt aga gga caa aga gta ata agt
tta aca gca tct tta gta aat caa aat ttg aga tta gat tgt
aga cat gaa aat aat aca aat ttg cca ata caa cat gaa ttt
tca tta acg (SEQ ID NO: 26).
was complemented with 10% DMSO to increase fidelity of
the Taq polymerase.

Small scale DNA preparation

Transformed bacteria were grown in 3 ml LB
cultures for more than 6 hours or overnight.
Approximately 1.5 ml of each culture was poured into 1.5
ml microfuge tubes, spun for 20 seconds to pellet cells
and resuspended in 200 µl of solution I. Subsequently
400 µl of solution II and 300 µl of solution III were
added. The microfuge tubes were capped, mixed and spun
for > 30 sec. Supernatants were transferred into fresh
tubes and phenol extracted once. DNA was precipitated by
filling the tubes with isopropanol, mixing, and spinning
in a microfuge for > 2 min. The pellets were rinsed in
70% ethanol and resuspended in 50 µl dH2O containing 10
µl of RNAse A. The following media and solutions were
used in these procedures: LB medium (1.0% NaCl, 0.5%
yeast extract, 1.0% tryptone); solution I (10 mM EDTA pH
8.0); solution II (0.2 M NaOH, 1.0% SDS); solution III
(2.5 M KOAc, 2.5 M glacial acetic acid); phenol (pH
adjusted to 6.0, overlaid with TE); TE (10 mM Tris HCl,
pH 7.5, 1 mM EDTA pH 8.0).
Large scale DNA preparation 

One liter cultures of transformed bacteria were
grown 24 to 36 hours (MC1061p3 transformed with pCDM
derivatives) or 12 to 16 hours (MC1061 transformed with
pUC derivatives) at 37°C in either M9 bacterial medium
(pCDM derivatives) or LB (pUC derivatives). Bacteria
were spun down in 1 liter bottles using a Beckman J6
centrifuge at 4,200 rpm for 20 min. The pellet was
resuspended in 40 ml of solution I. Subsequently, 80 ml
of solution II and 40 ml of solution III were added and
the bottles were shaken semivigorously until lumps of 2
to 3 mm size developed. The bottle was spun at 4,200 rpm
for 5 min and the supernatant was poured through
Sequencing

Synthetic genes were sequenced by the Sanger
dideoxynucleotide method. In brief, 20 to 50 µg
double-stranded plasmid DNA were denatured in 0.5 M NaOH for 5
min. Subsequently the DNA was precipitated with 1/10
volume of sodium acetate (pH 5.2) and 2 volumes of
ethanol and centrifuged for 5 min. The pellet was washed
with 70% ethanol and resuspended at a concentration of 1
µg/µl. The annealing reaction was carried out with 4 µg
of template DNA and 40 ng of primer in 1x annealing
buffer in a final volume of 10 µl. The reaction was
heated to 65°C and slowly cooled to 37°C. In a separate
tube 1 µl of 0.1 M DTT, 2 µl of labeling mix, 0.75 µl of
dH2O, 1 µl of [35S]dATP (10 µCi), and 0.25 µl of
Sequenase (12 U/µl) were added for each reaction. Five
µl of this mix were added to each annealed
primer-template tube and incubated for 5 min at room
temperature. For each labeling reaction 2.5 µl of each
of the 4 termination mixes were added on a Terasaki plate
and prewarmed at 37°C. At the end of the incubation
period 3.5 µl of labeling reaction were added to each of
the 4 termination mixes. After 5 min, 4 µl of stop
solution were added to each reaction and the Terasaki
plate was incubated at 80°C for 10 min in an oven. The
sequencing reactions were run on a 5% denaturing
polyacrylamide gel. An acrylamide solution was prepared
by adding 200 ml of 10x TBE buffer and 957 ml of dH2O to
100 g of acrylamide:bisacrylamide (29:1). A 5%
polyacrylamide, 46% urea, 1x TBE gel was prepared by
combining 38 ml of acrylamide solution and 28 g urea.
Polymerization was initiated by the addition of 400 µl of
10% ammonium peroxodisulfate and 60 µl of TEMED. Gels
were poured using silanized glass plates and sharktooth
combs and run in 1x TBE buffer at 60 to 100 W for 2 to 4
hours (depending on the region to be read). Gels were
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transferred to Whataan blottino » 
-«* X hour , and expose^to :!j; P ;r; ^ " -C *r 
temperature. Typical! v 7 ln at roott 

5 *»—««* buffer (200 „„ » th "« Procedures, 5x 
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cos * ^JirSJ TTolV 95 : forM -"" » « 
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cytoplasmic rha was i _ 
" transacted »3 T cells 
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"ours pest infection essln^'l * Mlls " 
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culture cells. In current. * fr ° B " Ssue 

" Ausubel etT" s °„° COlS la Mol ~"« 

"«>• Briefly, cells 'were * ' ^ Nev York, 

nuclei were spun out aXs^d 4 °° ^ bU "~' 
^ .... and 0.2 s g/Bl respecti: ely Pr °I einaSe K — «— 
extracts were incubated at 37.T, ^W-ic 
« Pbonovcnlorofor. extraoted t„L " " ln ' 

*»* was dissolved in 100 "* Palpitated. The 

"•c for 20 sin. The ' M lnc "bated at 

« "op b „„ er and prec:;r: a t e d w :;ai s n toppea by adain ' » 
- procedu« : \; s ";;^ l r ons were — * 

buffer with 10 „ " «-„ Buffer z (TE 

-«er (so „ EDTA lsTh.^ H ^ J " 
Slot blot analysis

For slot blot analysis 10 µg of cytoplasmic RNA
was dissolved in 50 µl dH2O to which 150 µl of 10x
SSC/18% formaldehyde were added. The solubilized RNA was
then incubated at 65°C for 15 min and spotted onto
nitrocellulose with a slot blot apparatus. Radioactively
labelled probes of 1.5 kb gp120IIIb and syngp120mn
fragments were used for hybridization. Each of the two
fragments was random labelled in a 50 µl reaction with
10 µl of 5x oligo-labelling buffer, 8 µl of 2.5 mg/ml
BSA, 4 µl of α[32P]-dCTP (20 µCi/µl; 6000 Ci/mmol), and
5 U of Klenow fragment. After 1 to 3 hours incubation at
37°C, 100 µl of TE were added and unincorporated
α[32P]-dCTP was eliminated using a G50 spin column.
Activity was measured in a Beckman beta-counter, and
equal specific activities were used for hybridization.
Membranes were pre-hybridized for 2 hours and hybridized
for 12 to 24 hours at 42°C with 0.5 x 10^6 cpm probe per
ml hybridization fluid. The membrane was washed twice
(5 min) with washing buffer I at room temperature, for
one hour in washing buffer II at 65°C, and then exposed
to x-ray film. Similar results were obtained using a
1.1 kb NotI/SfiI fragment of pCDM7 containing the 3'
untranslated region. Control hybridizations were done in
parallel with a random-labelled human beta-actin probe.
RNA expression was quantitated by scanning the hybridized
nitrocellulose membranes with a Magnetic Dynamics
phosphorimager.

The following solutions were used in this
procedure:

5x Oligo-labelling buffer (250 mM Tris HCl, pH 8.0, 25 mM
MgCl2, 5 mM β-mercaptoethanol, 2 mM dATP, 2 mM dGTP, 2 mM
dTTP, 1 M Hepes pH 6.6, 1 mg/ml hexanucleotides [dNTP]6);

Hybridization Solution ( M sodium phosphate, 250 mM
NaCl, 7% SDS, 1 mM EDTA, 5% dextran sulfate, 50%
0.1% SDS); Washing buffer II (0.5x SSC, 0.1% SDS); 20x
Vaccinia recombination

Vaccinia recombination used a modification of the
method described by Romeo and Seed (Romeo and
Seed, cell, 64: 103 7, 1991) . Briefly , ^ ^ at ?o ^ 
90* confluency were infected with i to 3 „1 of a wildtype 
10 vaccinia stock WR (2 x 10* pf u /ml) for 1 hour in culture 
medium without calf serum. After 24 hours, the cells
were transfected by calcium phosphate with 25 µg TKG
plasmid DNA per dish. After an additional 24 to 48 hours
the cells were scraped off the plate, spun down, and
resuspended in a volume of 1 ml. After 3 freeze/thaw
cycles trypsin was added to 0.05 mg/ml and lysates were
incubated for 20 min. A dilution series of 10, 1 and 0.1
µl of this lysate was used to infect small dishes (6 cm)
of CV1 cells that had been pretreated with 12.5 µg/ml
mycophenolic acid, 0.25 mg/ml xanthine and 1.36 mg/ml
hypoxanthine for 6 hours. Infected cells were cultured
for 2 to 3 days, and subsequently stained with the
monoclonal antibody NEA9301 against gp120 and an alkaline
phosphatase conjugated secondary antibody. Cells were
incubated with 0.33 mg/ml NBT and 0.16 mg/ml BCIP in
AP-buffer and finally overlaid with 1% agarose in PBS.
Positive plaques were picked and resuspended in 100 µl
Tris pH 9.0. The plaque purification was repeated once.
To produce high titer stocks the infection was slowly
scaled up. Finally, one large plate of HeLa cells was
infected with half of the virus of the previous round.
Infected cells were detached in 3 ml of PBS, lysed with a
Dounce homogenizer and cleared from larger debris by
centrifugation. VPE-8 recombinant vaccinia stocks were
kindly provided by the AIDS repository, Rockville, MD,
and express HIV-1 IIIB gp120 under the 7.5 mixed
early/late promoter (Earl et al., J. Virol., 65:31,
1991). In all experiments with recombinant vaccinia, cells
were infected at a multiplicity of infection of at least
10.
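
The inoculum needed to reach a given multiplicity of infection follows directly from the cell number and the stock titer. The sketch below is a generic calculation; the cell count and titer in the usage example are hypothetical placeholders and are not values reported in this text.

```python
def inoculum_volume_ml(n_cells: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) needed to infect n_cells at the given MOI."""
    if n_cells <= 0 or moi <= 0 or titer_pfu_per_ml <= 0:
        raise ValueError("all arguments must be positive")
    # MOI is plaque-forming units per cell, so total pfu = cells x MOI.
    return n_cells * moi / titer_pfu_per_ml


if __name__ == "__main__":
    # Hypothetical example: 1e6 cells, MOI of 10, stock at 1e9 pfu/ml.
    volume = inoculum_volume_ml(1e6, 10, 1e9)
    print(f"{volume * 1000:.1f} µl of stock")
```

Because the text specifies an MOI of at least 10, the computed volume is a lower bound for a stock of the assumed titer.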

The following solution was used in this procedure:
AP buffer (100 mM Tris HCl, pH 9.5, 100 mM NaCl, 5 mM
MgCl2)

Cell culture 

The monkey kidney carcinoma cell lines CV1 and
Cos7, the human kidney carcinoma cell line 293T, and the
human cervix carcinoma cell line HeLa were obtained from
the American Tissue Typing Collection and were maintained
in supplemented IMDM. They were kept on 10 cm tissue
culture plates and typically split 1:5 to 1:20 every 3 to
4 days. The following medium was used in this
procedure:

Supplemented IMDM (90% Iscove's modified Dulbecco Medium,
10% calf serum, iron-complemented, heat inactivated 30
min 56°C, 0.3 mg/ml L-glutamine, 25 µg/ml gentamycin, 0.5
mM β-mercaptoethanol (pH adjusted with 5 M NaOH, 0.5
ml)).

Transfection

Calcium phosphate transfection of 293T cells was
performed by slowly adding 10 µg
plasmid DNA in 250 µl 0.25 M CaCl2 to the same volume of
2x HEBS buffer while vortexing. After incubation for 10
to 30 min at room temperature the DNA precipitate was
added to a small dish of 50 to 70% confluent cells. In
cotransfection experiments with rev, cells were
transfected with 10 µg gp120IIIb, gp120IIIbrre,
syngp120mnrre or rTHY-1enveglrre and 10 ng of pCMVrev or
CDM7 plasmid DNA.
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»™ h f ° ll0Wing s °l«tions were used i„ this 

^""""—IrltnrlUM "Clavedj. 

5 cells weT? 4 l t0 S ° ^ BediUB — •«*•»■- and 
cells were incubated for additional 12 hours in cvs,m , 
free M diu* containino 200 „ci of ^-tran^i 

r~e-rr~ r -« 15 1 - - 

100 „ ff/a i respectively i »i „, ' 5 ° 

15 (kindly provided by Behrino) rail 

4 , c fflr „ . y enrmgj (all gpi20 constructs) at 

rut - 

25 £J if a r" C "°" " d l ° * ~ th "» 1 ' -^ted wiL 
25 ABplrry for 20 .in. dried and exposed for .2 hours 

this « , tollovi ^ buffers and solutions were used in 
this procedure: Wash buffer (100 „ niSi * » 

Trie, 1.25 H Glycin, 0.5% SOS); loading buffer (10 % 
glycerol, 4, SOS. 4, *-..r=aptoethanol! 



blue,. »-«r=aptoetha„ol, 0.02 % brosphenol 

Immunofluorescence

293T cells were transfected by calcium phosphate

35 af°C C 3 1P d tati ° n "* aMly " a ^ ^ «"l 

after 3 days. After detachment with 1 mM EDTA/PBS, cells
were stained with the monoclonal antibody OX-7 in a
dilution of 1:250 at 4°C for 20 min, washed with PBS and
subsequently incubated with a 1:500 dilution of a
FITC-conjugated goat anti-mouse immunoglobulin antiserum.
Cells were washed again, resuspended in 0.5 ml of a
fixing solution, and analyzed on an EPICS XL
cytofluorometer (Coulter).

The following solutions were used in this
procedure:

PBS (137 mM NaCl, 2.7 mM KCl, 4.3 mM Na2HPO4, 1.4 mM
KH2PO4, pH adjusted to 7.4); Fixing solution (2%
formaldehyde in PBS).
ELISA 

The concentration of gp120 in culture supernatants
was determined using CD4-coated ELISA plates and goat
anti-gp120 antisera in the soluble phase. Supernatants
of 293T cells transfected by calcium phosphate were
harvested after 4 days, spun at 3000 rpm for 10 min to
remove debris and incubated for 12 hours at 4°C on the
plates. After 6 washes with PBS, 100 µl of goat
anti-gp120 antisera diluted 1:200 were added for 2 hours. The
plates were washed again and incubated for 2 hours with a
peroxidase-conjugated rabbit anti-goat IgG antiserum
diluted 1:1000. Subsequently the plates were washed and
incubated for 30 min with 100 µl of substrate solution
containing 2 mg/ml o-phenylenediamine in sodium citrate
buffer. The reaction was finally stopped with 100 µl of
4 M sulfuric acid. Plates were read at 490 nm with a
Coulter microplate reader. Purified recombinant
gp120IIIb was used as a control. The following buffers
and solutions were used in this procedure: Wash buffer
(0.1% NP40 in PBS); Substrate solution (2 mg/ml
o-phenylenediamine in sodium citrate buffer).
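
One common way to turn absorbance readings into concentrations is to interpolate against dilutions of a purified standard. The text states only that purified recombinant gp120IIIb served as a control, so the standard curve, its values, and the function name below are assumptions made for illustration, not the procedure actually used.

```python
from bisect import bisect_left


def interpolate_concentration(od490, standards):
    """Linear interpolation of concentration from an absorbance reading.

    standards: (concentration, od490) pairs from a hypothetical dilution
    series of purified standard; absorbances must be distinct.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    ods = [od for _, od in points]
    if not ods[0] <= od490 <= ods[-1]:
        raise ValueError("reading lies outside the standard curve")
    i = bisect_left(ods, od490)
    if ods[i] == od490:
        return points[i][0]
    # Interpolate between the two bracketing standards.
    (c_low, od_low), (c_high, od_high) = points[i - 1], points[i]
    return c_low + (c_high - c_low) * (od490 - od_low) / (od_high - od_low)


if __name__ == "__main__":
    # Hypothetical standards: (ng/ml, absorbance at 490 nm).
    curve = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05)]
    print(f"{interpolate_concentration(0.65, curve):.1f} ng/ml")
```

Readings outside the range of the standards are rejected rather than extrapolated, since the colour reaction is not expected to stay linear near saturation.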
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: SEED, BRIAN 

(ii) TITLE OF INVENTION: OVEREXPRESSION OF MAMMALIAN AND VIRAL 
PROTEINS 

(iii) NUMBER OF SEQUENCES: 37 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson

(B) STREET: 225 Franklin Street 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: U.S.A. 

(F) ZIP: 02110-2804 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.308

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/308,286

(B) FILING DATE: 19-SEP-1994 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: CLARK, PAUL T 

(B) REGISTRATION NUMBER: 30,162 

(C) REFERENCE/DOCKET NUMBER: 00786/226001 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 542-5070 

(B) TELEFAX: (617) 542-8906 

(C) TELEX: 200154 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CGCGGGCTAG CCACCGAGAA GCTG 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 196 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

™«c r^r^ 

«=T=TTC ^COCCCCO *0«««OC CXXCOV^CC 

™ M cc _ „ = 

CGAGAACTTC AACATG 196

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCACCATGTT GTTCTTCCAC ATGTTGAAGT TCTC
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GACCCAGAAC TTCAACATGT GGAAGAACAA CAT
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TGGAAGAACA ACATGGTGGA GCAGATCCAT GAGCACATCA TCAGCCTCTG GGACCAGAGC 
CTGAAGCCCT GCGTGAAGCT GACCCCCTGT GCGTGACCTG AACTGCACCG ACCTGAGGAA 120 
CACCACCAAC ACCAACACAG CACCGCCAAC AACAACAGCA ACAGCGAGGG CACCATCAAG 180 
GGCGGCGAGA TC 192

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CTTCAAOCTC CAGTTCTTCA TCTCGCCCCC CTT
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GAAGAACTGC ACCTTCAACA TCACCACCAG C 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 195 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 
AACATCACCA CCAGCATCCG CGACAAGATG CAGAACGAGT ACCCCCTGCT GTACAAGCTG 
GATATCGTGA GCATCGACAA CGACAGCACC AGCTACCGCC TGATCTCCTG CAACACCAGC 
GTGATCACCC AGGCCTGCCC CAAGATCACC TTCGAGCCCA TCCCCATCCA CTACTGCGCC 
CCCGCCGGCT TCGCC 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CAACTTCTTC TCCCCCGCGA ACCCGCCGGG

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CCGCCCCCGC CGGCTTCGCC ATCCTGAAGT GCAACGACAA GAAGTTC 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 198 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCCGACAAGA AGTTCAGCGO CAAGGGCAGC TGCAAGAACG WAGCACCGT GCAGTGCACC 6Q 
— — -CTCCTGA ACCCCACCCT CCC^ , 

AATCAGAGCG TGCAGATC 198

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ACTTCGCACG CGTGCAGTTG ATCTGCACGC TCTC
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS: 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GAOAGCCTGC AGATCAACTG CACGCGTCCC
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 120 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 
AACTCCACGC GTCCCAACTA CAACAAGCGC AAGCGCATCC ACATCGGCCC CGGGCGCGCC 
TTCTACACCA CCAAGAACAT CATCGGCACC ATCCTCCAGC CCCACTGCAA CATCTCTAGA 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GTCGTTCCAC TTCGCTCTAG AGATGTTGCA 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GCAACATCTC TAGAGCCAAG TGGAACGAC 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 
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<*i> SEQUENCE DESCRIPTION: SEQ ID K0 . 17 . 
CCCAACTCCA ACCACACCCT CCCCCACATo ~~ 

— « . r;r: r™; ~* — 

ACTCCGCCOG C ^ACCCCCA CATCCTCATC CACACCTTCA 



120 



(2) INFORMATION FOR SEQ ID 131 



SEQUENCE DESCRIPTION: SEQ „ NOsl8j 
CCACTACAAC AATTCCCCCC CCCACTTCA 
(2) INFORMATION PO R SEQ I D Noa9j 

(») TYPE: nucleic 
(c) strandedneIs* 

(D) TOPOLOGy . ll„.« ' * 

<«i> SEQUENCE DESCRIPTION: SEQ „ 
TCAACTOCOC CCCCCAATTC TTCTACT6C 
(2> INFO **«lON FOR SEQ ID NO:20: 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS- ainol- 

(D) TOPOLOcyriln.^" 91 * 



(XI) SEQUENCE DESCRIPTION: SEQ ID NO-20- 

^^rrr-' 

«-a-. JZZ TCCBSKC " """"^ » 

""""WIOH FOB 5E fl re HO!jl! »s 
(1 > SE 0>«»« CHXMcnmsTics, 
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GCAGACCGGT GATGTTCCTG CTGCACCGGA TCTGGCCCTC 40
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CGAGGCCCAG ATCCCGTGCA CCAGCAACAT CACCGGTCTG 40 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 242 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

AACATCACCG GTCTGCTCCT GCTGCTGACC CGGACGGCGG CAAGGACACC GACACCAACG 60 

ACACCCAAAT CTTCCGCGAC GGCGGCAACG ACACCAACCA CACCGAAATC TTCCCCCCCC 120 

GCGGCGGCGA CATGCGCGAC AACTGGAGAT CTGAGCTGTA CAAGTACAAG GTGGTGACGA 180

TCGAGCCCCT GGGCGTCGCC CCCACCAAGG CCAAGCCCGC GGTGGTGCAG CGCCAGAAGC 240 

GC 242 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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<*i> SEQUENCE DESCRIPTION: SEQ ID KO , 24 . 
««^CCCC COCTTTACCO CTTCTCCCCC TCCACCAC 

(2) INFORMATION FOR SEQ ID KO : 2 S, 38 

, ( n Hf Es "«el«ic acid 
In! flS^EONESSt .ingi. 
(O) TOPOLOGY, linear 

C*i) SEQUENCE DESCRIPTION.. SEQ „ 
CCCCCGCGAT CCAACC ™^ CATGATTCCA GTAATAACT 

(2) INFORMATION FOR SEQ ID NO, 26- 39 

g saaaaral. 

(D) TOPOLOCE. 11.,, ' 
(-1. SEQUENCE OESCMPTxce, ^ „ „,„, 

2zz zzi: rr ~— ■ — - - 

<2) INF °««ATION FOR SEQ I0 NOl27s "5 

IS SSSS^i^ 

(Xi) SEQUENCE DESCRIPTION : SEQ ID NO-27- 
CGCGGGGAAT TCACGCGTTA ATGAAAATTC ATGTTG 

(2) INFORMATION POR SEQ ID NO, 28 , 36 

U) S ff^ a CHARACTERISTICS: 
(A) LENGTH: 30 baae £71. 
B) TYPE: nucleic !cid 

(C) STRANDEDNESS: .ingl. 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CCCGCATCCA CCCGTCAAAA AAAAAAACAT 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CCTGAAAAAA AAAAACATGT ATTAAGTGGA ACATTACGAG TACCAGAACA TACATATAGA 60 
AGTAGAGTAA TTTGTTTAGT GATAGATTCA TAAAAGTATT AACATTAGCA AATTTTACAA 120 
CAAAAGATGA AGGAGATTAT ATGTGTCAG 149 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
CCCGAATTCG AGCTCACACA TATAATCTCC 30
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CGCCGATCCG AGCTCAGAGT AAGTGGACAA 30 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 
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<xi> SEQUENCE DESCRIPTION: SEQ ID NO-32- 
CTCAGACTAA CTCCACAAAA TCCAACAAGT AGTAATA**. 

— — ™ ~ ~ ~ T - 
mwcTTTT ™»„» lJ0 

(2) IWOMATIOK TOR SSQ ID »o, 33 , "° 

(A) LENGTH: 36 baa* M ir« 

(B) TYPE: nuen* acij 
In "REDNESS: ainala 
(O) TOPOLOGY: linaar 



(XI) SEQUENCE DESCRIPTION : SEQ I„ n 0s33 . 
CCCCAATTCG CGCCCCCTTC ATAAACTTAT AAAATC 
(2) INFORMATION FOR SEQ ID NO,34, 
U> SEQUENCE CHARACTERISTICS: 

a iSS™' 1632 *>a.ala ir . 
I? H PE! nuc l«ic acid 
C) STRANDEDNESS: aingl. 
(0) TOPOLOGY: linaar 



(xi) SEQUENCE DESCRIPTION : SEQ ID N0 34- 

= rr: — — 

«~» ~r rzr *~ 

GCCACCGACO prur, WVCCAC CCTGTTCTCC 

====== 

*«AAGA ACAACATGGT CGAGCAGATG CATCAGCACA Tc»Tr„ 

ACCCTGAAGC CCTGCGTGAA GCTGACCC^ J ^TCAGCCT GTGGGACCAG 

GCTCACCCCC CTGTCCGTCA CCCTCAACTG CACCCAfv-r^ 

====== 

====== 

—===== 
CACCCCATCC GGCCGGTCGT GAGCACCCAG CTCCTGCTGA ACGGCAGCCT GGCCGAGGAG
GAGGTGGTGA TCCGCACCGA GAACTTCACC GACAACGCCA AGACCATCAT CGTGCACCTG 
AATCACAGCG TGCAGATCAA CTGCACGCGT CCCAACTACA ACAACCCCAA GCGCATCCAC 

ATOGGCCCCC GCCGCGCCTT CTACACCACC AAGAACATCA TCGGCACCAT CCCCCAGGCC 1080 

CACTGCAACA TCTCTAGAGC CAAGTGGAAC GACACCCTGC GCCAGATCGT GAGCAAGCTG 1140 

AAGGAGCAGT TCAAGAACAA GACCATCGTG TTCAACCAGA GCAGCGGCGG CGACCCCCAG 1200 

ATCGTGATGC ACAGCTTCAA CTGCGGCGGC GAATTCTTCT ACTGCAACAC CAGCCCCCTG 1260 

TTCAACAGCA CCTCGAACGG CAACAACACC TGGAACAACA CCACCGCCAG CAACAACAAT 1320 

ATTACCCTCC AGTGCAAGAT CAAGCAGATC ATCAACATGT GGCAGGAGGT GGGCAAGGCC 1380 

ATGTACGCCC CCCCCATCGA GGGCCAGATC CGGTGCAGCA GCAACATCAC CGGTCTGCTG 1440 

CTGACCCGCG ACGGCGGCAA GGACACCGAC ACCAACGACA CCGAAATCTT CCCCCCCGCC 1500 

GGCGGCGACA TGCGCGACAA CTGGAGATCT GAGCTGTACA AGTACAAGGT GGTGACGATC 1560 

GAGCCCCTGG GCGTGGCCCC CACCAAGGCC AAGCCCCGCG TCGTGCACCG CCAGAAGCGC 1620 

TAAAGCGGCC GC 1632 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2481 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

ACCGAGAAGC TGTGGGTGAC CGTGTACTAC GGCGTGCCCG TCTGGAAGGA GGCCACCACC 60 

ACCCTGTTCT GCGCCAGCCA CGCCAAGGCG TACGACACCC AGCTGCACAA CCTGTGGCCC 120 

ACCCAGGCGT GCGTGCCCAC CGACCCCAAC CCCCAGGAGC TGGAGCTCCT GAACGTCACC 180 

GAGAACTTCA ACATGTGGAA GAACAACATG CTGGAGCAGA TGCATCACGA CATCATCACC 240 

CTGTGGGACC AGAGCCTGAA GCCCTGCGTG AAGCTGACCC CCCTGTGCGT GACCCTGAAC 300 

TGCACCGACC TGAGCAACAC CACCAACACC AACAACAGCA CCGCCAACAA CAACACCAAC 360 

AGCGAGGGCA CCATCAAGGG CGGCGAGATG AAGAACTGCA GCTTCAACAT CACCACCACC 420 

ATCCCCGACA AGATGCAGAA GGAGTACGCC CTGCTGTACA AGCTGGATAT CGTGAGCATC 480 

CACAACGACA GCACCAGCTA CCGCCTGATC TCCTGCAACA CCAGCGTGAT CACCCAGGCC 540 

TGCCCCAAGA TCAGCTTCGA GCCCATCCCC ATCCACTACT GCGCCCCCGC CGCCTTCCCC 600 

ATCCTGAAGT GCAACGACAA GAAGTTCAGC GGCAAGCCCA GCTCCAAGAA CGTGACCACC 660 
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CTCCACTOai <=»««" CCCCCCCCTC «0 MC »CCC AOCTCCTCCT OAACCSC** 

. =™*««q accmcko, catccccacc 0 « umcJ1 ccckwccc auu»ca« 

T °"™ CAO °~" "™ c — — — 

— TTCTXC»C» CCA»=„CXT „TCO=C«C 

' TOa " <=«"™=« "tctci*™ oco^roo. xccacaccct cccccwatc 

'TCHXVCC TCAACCXOCX CTTCAACAAC ^ 
— — A=»TC=TC„ CCACACCTTC «CTCO«CC= CCCTTCTT CXACTCCAAC 
«X«CCCCe WKUO, CACCTMMC CCCA^CA CCTKA**, CACCCCOOC 
,=CAACAACA " miCOT -W-M. ATCAACCAGA TC.TCAACAT CTCCCACCAC 
OIOCOC^ CCATCTACOC CCCCCCCAK GMCcccAg, TCCOC^CO OCCAACC 
ACCKTCTCC KCKACCCO CSACCOCCCC AACCACACCC ACACCAACGA CACCGAAATC 



TTCCCCCCCC GCGGCGGOGA CATGCGCGAC AACTGGAGAT CTGAGCTGTA CAAGTACAAG 
CTGGTGACGA TCCACCCCCT GGGCGTGGCC CCCACCAACC CCAAGCOCCG CGTGGTGCAG 1440 
CGCGAGAAGC GGGCCCCCAT CGGCCCCCTG TrcCTGGOCT XCCTGGGGGC OGCGGGCAGC 150 0 
ACCATGCCGO CCGCCAGCGT GACCCTCACC GTGCAGCCCC GCCTGCTCCT GAGCGGCATC 



720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



-——•WW* (MWCBBCMC 1560 

GTGCAGCACC AGAACAACCT CCTCCGCGCC ATCGAGGCCC ACCAGCATAT GCTCCAGCXC 1„0 

ACCCTGTCGG GCATCAAGCA GCTCCAGGCC CGCGTGCTGC CCGTCGACCG CTACCTCAAC 1680 

CACCAGCAGC TCCXGGGCTX CTGGGGCTGC XCCGGCAAGC TGATCTCCAC CACCACGGXA 17 40 

CCCTCGAACC CCTCC ^ M CTGCAOGACA XCTGGAACAA CATGACCTGC 18 00 

ATGCAGTGGG AGCGCGAGAT CGATAACTAC ACCAGCCTGA XCXACAGCCT GCTGGAGAAG 1860 

AGCCAGACCC AGCAGGAGAA GAAOGAGCAG GAGCTGCTGG AGCTGGACAA CTGGGOGAGC 1920 

CTGTCGAACT CGTTCGACAT CACCAACTGG CTGTCGTACA TCAAAATCTT CATCATGATT 1 980 

GTGGCCGGCC TGGTGGGCCT CCGCATCGTG TTCGCOGTGC TGAGCATOGT CAACCGCCTG 2040 

CCCCAGGGCX ACAGCCCCCT GAGCCTCCAG ACCCGGCCCC CCGXGCCGCG CGGGCCCGAC 2100 

CGCCCCGAGG GCATCGAGGA GGAGGGCGGC GAGCGCGACC GCGACACCAG CGGCAGGCTC 2160 

GTGCACGGCT TCCTGGCGAT CATCTGGGTC GACCTCCGCA GCCTCTTCCT GTTCAGCTAC 2220 

CACCACCGCG ACCTGCTGCT GATCGCCGCC OGCATCGTGG AACTCCTAGG CCGCCGCGGC 2280 

TGGGAGGTGC TGAAGTACTG GTGGAACCTC CTCCAGTATT GGAGCCAGGA GCTGAAGTCC 2340 

AGCGCCGTGA GCCTGCTGAA CGCCACCGCC ATCGCCGTGC CCGAGGGCAC CGACCGCGTG 2400 

ATOGAGGTGC TCCAGAGGGC CGGGAGGGCG ATCCTGCACA TCCCCACCCG CATCCGCCAG 2460 
GGCCTCGAGA GGCCCCTGCT G 2481
(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 486 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
ATGAATCCAG TAATAAGTAT AACATTATTA TTAAGTGTAT TACAAATGAC TACAGGACAA 60
ACAGTAATAA GTTTAACAGC ATCTTTAGTA AATCAAAATT TGAGATTACA TTGTACACAT 120
GAAAATAATA CACCTTTGCC AATACAACAT GAATTTTCAT TAACGCCTGA AAAAAAAAAA 180
CATGTATTAA GTGGAACATT AGGAGTACCA GAACATACAT ATAGAAGTAG AGTAAATTTG 240
TTTAGTGATA GATTCATAAA AGTATTAACA TTAGCAAATT TTACAACAAA ACATGAACGA 300
GATTATATCT GTGACCTCAG AGTAAGTCGA CAAAATCCAA CAACTAGTAA TAAAACAATA 360
AATGTAATAA GAGATAAATT AGTAAAATGT GGAGGAATAA GTTTATTAGT ACAAAATACA 420
AGTTGGTTAT TATTATTATT ATTAAGTTTA AGTTTTTTAC AAGCAACAGA TTTTATAACT 480
TTATGA 486

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 485 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

ATGAACCCAG TCATCAGCAT CACTCTCCTG CTTTCAGTCT TGCAGATGTC CCGAGGACAG 60 

AGGGTGATCA CCCTGACAGC CTGCCTGGTG AACAGAACCT TCCACTCGAC TGCCCTCATC 120 

AGAATAACAC CAACTTGCCC ATCCAGCATG AGTTCAGCCT CACCCGAGAG AACAAGAACC 180 

ACGTGCTGTC AGGCACCCTG GGGCTTCCCG AGCACACTTA CCCCTCCCCC GTCAACCTTT 240 

TCAGTGACCG CTTTATCAAG GTCCTTACTC TAGCCAACTT GACCACCAAG GATGAGGGCG 300 

ACTACATGTG TCAACTTCGA GTCTCCGGCC ACAATCCCAC AACCTCCAAT AAAACTATCA 360 

ATGTGATCAG AGACAAGCTG GTCAAGTCTG GTGGCATAAG CCTCCTGCTT CAAAACACTT 420 

CCTGGCTGCT GCTCCTCCTG CTTTCCCTCT CCTTCCTCCA AGCCACGGAC TTCATTTCTC 480 

TGTGA 485
What is claimed is:
preferred or less prefer™, T lMrt OM no "- 

-odin, Mid JL^: 1 !^ 0 " in °» ""ura! gTOe 

~ ,r ~. r.^ .r ein - 

protein at a level which is at least 110% of that expressed
by said natural gene in an in vitro

Protein at'a J^lT-eTT^ £7"" 
-ture , ystM under id . n J™ » " £«- -U 

Protein at', ^^^.r^"^,^ 
expressed by said natural gene in an in vitro
culture system under identical conditions.
7. The synthetic gene of claim 1 wherein at least
10% of the codons in said natural gene are non-preferred
codons.

8. The synthetic gene of claim 1 wherein at least
50% of the codons in said natural gene are non-preferred
codons.

9. The synthetic gene of claim 1 wherein at least
50% of the non-preferred codons and less preferred codons
present in said natural gene have been replaced by
preferred codons.

10. The synthetic gene of claim 1 wherein at
least 90% of the non-preferred codons and less preferred
codons present in said natural gene have been replaced by
preferred codons.

11. The synthetic gene of claim 1 wherein said
protein is a retroviral or lentiviral protein.

12. The synthetic gene of claim 11 wherein said
protein is an HIV protein.

13. The synthetic gene of claim 12 wherein said
protein is selected from the group consisting of gag,
pol, and env.

14. The synthetic gene of claim 13 wherein said
protein is gp120 or gp160.

15. The synthetic gene of claim 1 wherein said
protein is a human protein.
16. A method for preparing a synthetic gene
encoding a protein normally expressed by mammalian cells
comprising identifying non-preferred and less-preferred
codons in the natural gene encoding said protein and
replacing one or more of said non-preferred and less-preferred
codons with a preferred codon encoding the same
amino acid as the replaced codon.
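
The method of claim 16, together with the codon percentages recited in claims 7 to 10, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' procedure. The function names (`codons`, `classify`, `translate`, `fraction_non_preferred`, `optimize`) are hypothetical. The codon tables are transcribed from the preferred and less preferred codon lists in the Summary; the entries for His (CAC) and Pro (CCC) are assumptions, because those two entries are illegible in this copy. Amino acids with no listed preferred codon (Glu, Met, Trp) and stop codons are left unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Preferred codon for each amino acid, as listed in the Summary.
# ASSUMPTION: His (CAC) and Pro (CCC) are illegible in this copy of the text.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC", "Q": "CAG",
    "G": "GGC", "H": "CAC", "I": "ATC", "L": "CTG", "K": "AAG", "P": "CCC",
    "F": "TTC", "S": "AGC", "T": "ACC", "Y": "TAC", "V": "GTG",
}
# Less preferred codons, as listed in the Summary.
LESS_PREFERRED = {"GGG", "ATT", "CTC", "TCC", "GTC"}


def codons(seq):
    """Split an in-frame coding sequence into a list of codons."""
    seq = seq.upper().replace(" ", "")
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def classify(codon):
    """Return 'preferred', 'less preferred' or 'non-preferred' for a codon."""
    if PREFERRED.get(CODON_TO_AA[codon]) == codon:
        return "preferred"
    if codon in LESS_PREFERRED:
        return "less preferred"
    return "non-preferred"


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[c] for c in codons(seq))


def fraction_non_preferred(seq):
    """Fraction of codons that are non-preferred (compare claims 7 and 8)."""
    cs = codons(seq)
    return sum(classify(c) == "non-preferred" for c in cs) / len(cs)


def optimize(seq):
    """Replace each codon with the preferred codon for the same amino acid."""
    return "".join(PREFERRED.get(CODON_TO_AA[c], c) for c in codons(seq))


natural = "ATGAATCCAGTAATAAGT"  # first 18 nucleotides of SEQ ID NO: 36
synthetic = optimize(natural)
print(synthetic)                # ATGAACCCCGTGATCAGC
print(translate(natural) == translate(synthetic))  # True
```

Because Met has no listed preferred codon, ATG counts as non-preferred under the Summary's definition yet is left in place, so the non-preferred fraction of a fully substituted gene need not reach zero.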
Syngp120mn

1 C7C3ACATCC AT7C7GC7C7 AAAGGAGA7A CCCGGCCAGA CACCC7CACC 
51 73CGG7GCCC AGC7GCCCAG GC7GAGCCAA GAGAAGGCCA GAAACCA7CC 
122 C3A7GGGG7C 7773CAACC3 C73GCCACC7 737ACCTGC7 SGGGATCC7G 
= TC3CTrCC = ~~ AGCCAC '3ACAAGC7G 7GGG7GACCG 737AC7ACGG 
201 C3TGCC737G 73GAACGAGG CCACCACCAC CC7G7777GC 3CCAGCGACG 
25! C=AAGGC3TA C3ACACC3AG G7GCACAACG 737GGGCCAC CCAGGCG7GC 
2 31 37GCC3ACC5 AlCCCAACCC 3CAGGAGG7G GAGC7CG73A ACG7GACC3A 
351 3AAC77CAAC A7G7G3AAGA ACAACA7CG7 G3AGCAGA73 CA73AGCACA 
4=1 7CA7CAGCC7 CTGSGACCAG AGCC73AAGC CC7GC37GAA GC7GACCC3C 
<S1 C737GC37GA C r~3AAC73 CACC3ACC75 AGCAACACCA CCAACACCAA 
5C1 CAACAGCACC VCAACAACA ACAGCAACAG C3AGGSCACC A7CAASGGCG 
S51 3CGACA7GAA CAAC7GCAGC 77CAACA7CA CCACCAGCAT CC3CGACAAC 
601 A?GCAGAACG> A37AC3CCC7 GC7G7ACAAG C7GGA7A7C3 7GAGCA7CGA 
=51 CAACGACACC ACCAGC7ACC GCC7GA7C7C C7GCAACACC AGCCTGATCA 
-I C3CAGGCC73 QCCCAACATC AGC77CGAGC C3A7CCCCA7 CCAC7AC7GC 
5. 3CCCCC3CCG qCTTCSCCAT C77GAAC73C AAC3ACAAGA AC77CAGCCG 
30: CAAGGGCAGC TGCAAGAACG T3AGCACCGT GCAGT3CACC CACGGCA7CC 
251 33CC3G7GG7 ^GCACCCAG C7CC7GC7CA ACGGCAGCCT 3GCC3AGGAG 
SCI 3AGG7GG7GA TCCGCAGCGA GAAC77CACC GACAAC3CCA AGACCA7CA7 
351 C37GCACC7G AA7GAGAGCG 73CAGA7CAA C73CACGCG7 CCCAACTACA 
10C1 ACAAGC3CAA <*CGCA7CCAC A7CGGCCCC3 GGCCC3CC77 C7ACACCACC 
1C51 AAGAACA7CA TCGGCACCAT CC3CCAGGCC CACTGCAACA 7C7C7AGAGC 
1101 CAAGTGGAAC CACACCC73C GCCAGA7C37 3AGCAAGC7G AAGGAGCAGT 
1151 7CAAGAACAA CACCA7CG7C TTCAACCAGA GCAGCGCCGG CGACCCCGAC 
1201 ATCGTGATGC ACAGC77CAA C7GCGGCGGC CAA77C77C7 AC7GCAACAC 
1251 CAGCCC3C7G TTCAACAGCA CC7CCAACGG CAACAACACC 73GAACAACA 
1301 CCACC33CAG CAACAACAAT A7TACCC7CC AG7GCAAGA7 CAAGCAGATC 
1351 A7CAACA7G7 CGCAGCAGG7 GGGCAAGCCC A7C7AC3CCC CCCCCA7CGA 
-<01 GGGCCAGA7C CGG7GCAGCA CCAACA7CAC C3G7C7CC7C C7GACCCCCG 

1451 ACGCCCGCAA CGACACCGAC ACCAACGACA CCGAAA7C77 CCGCCCCGGC ' <* ) 
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=GCCGC=ACA T5C3C3ACAA CTSSACA'C "A „. 

• r __ -«GA.CT oA«v. tG TACA A5TACAACCT 

— 51 GGTCACSATC i^CC ZC 

— C= CACCAACGCC AACC3C 

-aw- T5CTCCAGCC CJACAAGC3C TAAAGCGGC" •*«■ / <r^ 

L sea i o wo 
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""" kZ "=7AC7AC CKrs.~~ 7373CAAGCA 
: 3C3CCA3CGA Cr,CCAAGCC3 7AC3ACAC33 

= —ZACZ—ZT GAACSTKArc 3AGAAC777A ACA7373GAA 

::: gaacaaca73 ctscagcasa t^a^agga catcatcagc ~37gggacc 

w-.^AA t, 5 AA5CT3ACCC 37.77373C37 GACC77CAAC 

3 31 TSCACTSA.-.- T-AGCAACAC CACCAACACC AACAACACCA 7C3CCV.CAA 
3 = 1 CAACAGCAAC A3C3AC3CCA CCA77AAGG3 C33C3ACA73 AAGAAC73CA 
""~- AACA " CA-A-ACr A?C"~ACA ACATCCAGAA SSAtflACSCC 
•I;". -75C737ACA A3C733A7A7 C3T3ACCATC CACAAC3.\C A UCACCA3C7A 

A "~=~=~ ^CAAC3A-V. JAAGrrCAGC 3GCAAG3GCA 3C7GCAAGAA 
»j.j/\w^rv«. . - - --A CC3AC33CA7 CC3GCC3G75 3T3AGwACCC 
.-».». 7C73C7 QAA^-JSCAGC 3733CS3AG3 AGCAGGTSG7 ~irrrr.C\ZC 
Z: - Z '^~-- A C33ACAAC3C 3AAGAC3A7C A7C37SCACC 7SAA73AGA0 
SOI C373CA3A7C AAC7CCAC5C 373CCAAC7A CAACAAGC5C AAGC3CA733 
351 ACATCSSSCr C3G3C3C3CC 7737ACACCA CCAAGAACA7 CA7C3GCACC 
- ■ C"AC7CCAA CA7r777AGA 3C3AAG733A A33ACACC77 
'' Z:ZZZ "~"~ : (pSAGCAASC 73AAGGAGCA 377CAASAAC AASACCA73S 
-.r: 7GT7 CAACCA GASrAGCGGC 33G3ACCCC3 A3A7C373AT GCACACC77C 

HI- .-ACC73GAAC QGCAACAACA C373GAACAA cac--.c-.gc AGCAACAACA 
"7A77ACC77 C-AG73CAAG A7CAA3CAGA ;CA7CAACA7 373GCA3GA3 
•-231 37GGGCAAGG CCA737ACGC CZCCCCCATC -AGG3CCA3A 7"ZZ~ZZZG 
1251 CAGCAACA7C AC===7~,C TGC73ACC33 C3AC3GCGGC .ACT.ACTACCG 
---- n^AZZ .*-*. Z 'J A CACC5AAA7C ZZZZZZZZZZ 3C3GC"C~A CA*"3"'"*** — *C 
AA*.-73GAGA7 C73AGC737A CAAG7ACAA.-. ~3G7CAC3A 7C3A3C33— 
-1.- ^w.w::: CGCACCAACC ?CAA"GCC3 C373G73CA3 C3C3ACAAGC 
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1451 3GGCC3CCAT Z J3C3CCCT j 7TCC7SCGC7 TC 

-501 ACCA7CGGGG CJ3CCAGCCT 3ACCCTGACC GTGCAGGCCC GCCTCCTCCT 

ISS: GAGCGGCATC 3TCCAGCAGC AGAACAACCT CCTCCGCGCC A7CGAGGCGC 

1531 AGCAGCATAT QrTCCAGCTC ACC37G7GGG CCA7CAAGCA GC7C3AGGCC 

rTGGGGGTGC 72C3GCAAGC 73A7C7GCAC CAC1ACGG7A CCC73GAACG 
C77CC7GGAG CVACAAGAGC C73GACGACA 7G73GAACAA CA7GACC7 3G 
l=:i A7GCAG7GGG AGCGCGAGA7 33A7AAC7AC ACCAGCC73A 7C7ACAGC77 
1 = 51 3C7GGAGAAG A3CCAGACCC AGCAGGAGAA GAACGAGCAG GAGC7GC7GG 
13 CI AGC7CGACAA C?GGGC3AGC C7G7GGAAC7 3G77C3ACA7 CACCAAC73G 

rrGCArccrc T7C3cc37:c tcaccatggt 3aacc3cg73 c3ccaggg~ 

33CCCC3AGG GpCATCGAGGA GGAGGGCGGC 3AGC3CGACC 3C3ACACCAG 
33GCAGGC7C Q7GCAGGGC7 73C73GCGA7 CA7C7GGG7C 3ACC7CCSCA 

*2S1 C3CA7C37GG AAC7CC7AGG CC3CCCCGGC TGGGAGGT3C 7GAAG7AC7G 
:30i 37GGAAC373 C7CCAG7A77 3GAGCCAGGA GC7GAAG7CC AGCGCC37GA 
::5: 3C773C7GAA C3CGACC3C3 A7CGCCG7GC CGGAGCGCAC C3ACC3C373 
FIGURE 2

FIGURE 5